ABSTRACT

On large scales galaxies and their halos are usually assumed to trace the dark matter
with a constant bias, and the dark matter is assumed to trace the linear density field.
We test these assumptions using several large N-body simulations with $384^3$-$1024^3$
particles and box sizes of 96-1152 $h^{-1}$Mpc, which can both resolve the small galactic
size halos and sample the large scale fluctuations. We explore the average halo bias
relation as a function of halo mass and show that existing fitting formulae overestimate
the halo bias by up to 20% in the regime just below the nonlinear mass. We
propose a new expression that fits our simulations well. We find that the halo bias
is nearly constant, $b \sim 0.65-0.7$, for masses below one tenth of the nonlinear mass.
We next explore the relation between the initial and final dark matter in individual
Fourier modes and show that there are significant fluctuations in their ratio, ranging
from 10% rms at $k \sim 0.03\,h$/Mpc to 50% rms at $k \sim 0.1\,h$/Mpc. We argue that these
large fluctuations are caused by perturbative effects beyond the linear theory, which
are dominated by long wavelength modes with large random fluctuations. Similar or
larger fluctuations exist between halos and dark matter and between halos of different
mass. These fluctuations must be included in attempts to determine the relative bias
of two populations from their maps, which would otherwise be immune to sampling
variance.



1 INTRODUCTION 



Determination of the power spectrum of mass fluctuations 
and its redshift evolution is one of the main goals of modern 
observational cosmology. Its accurate measurement would 
allow us to address some of the most fundamental questions
in cosmology today, such as the shape of the primordial power
spectrum and its relation to fundamental theories of structure
formation, the mass of the neutrino and the nature of dark
energy.

In general there are two approaches to the measurement 
of the matter power spectrum. One is to measure galax- 
ies, either in redshift space or in angular position (perhaps 
supplemented by photometric redshift information) and to 
assume they trace the dark matter. This assumption is be- 
lieved to be valid on large scales, where the so called linear 
bias model assumes that the galaxy density field is propor- 
tional to the matter density field times a free parameter 
called bias. While power spectrum measurements of galaxies
with modern surveys such as SDSS (Tegmark et al. 2003) or
2dF (Percival et al. 2001) have enormous statistical power,
they can only determine the shape of the matter power spectrum
and not its amplitude, because of the bias uncertainties.
This limits their use in the study of the growth factor evolution,
which is important for investigations of dark energy models.
In addition, on small scales information from galaxy clustering
is limited by the uncertainties in the relation between
the galaxies and the dark matter, which make the bias scale
dependent. For this reason the small scale information is
usually discarded.

The situation with galaxies would not be as dim if
we could determine the bias. Here we explore a method to 
determine galaxy bias based on determination of clustering 
amplitude of faint galaxies. These are likely to occupy low 
mass halos which, as we show in this paper, have a well 
determined large scale bias that is nearly independent of 
halo mass. While some fraction of these galaxies are satellites 
in larger halos, this can be quantified and corrected for (see 
? for a first application of this method to the real data). In 
addition, there exist populations, such as IRAS galaxies, for 
which this fraction may be small. In these cases measuring 
the large scale power spectrum amplitude for these galaxies 
determines the matter power spectrum amplitude as well. 

The main problem with using faint galaxies as tracers of 
large scale structure is that in a typical flux limited survey 
faint galaxies occupy a small nearby volume, so the sam- 
pling variance errors for power spectrum on large scales are 
large. However, we can still determine their bias relative to 
a population of brighter galaxies. If there is no stochastic- 
ity between the two populations then a direct comparison of 
the maps gives an accurate determination of the relative bias 
with no sampling variance. We can then use the power spec- 
trum determination of the brighter population, with smaller 
sampling variance errors because of larger volume covered, 
to determine the power spectrum of the fainter population 
and of the matter itself. The limiting source of noise is the
stochasticity between these fields, which we explore in this 
paper. 

Another approach to determine matter fluctuations is to 
use weak lensing induced correlations between background 
galaxy ellipticities. These are sensitive to the dark matter 
fluctuations directly and as such this approach holds the 
promise to improve upon the limitations of the galaxy clus- 
tering methods. Its main limitation is that it traces the dark 
matter in angular projection and has large sampling variance 
errors on large scales. This limits the statistical power of the 
weak lensing surveys. On small scales the nonlinear correc- 
tions, noise, intrinsic correlations and other systematic con- 
taminations become significant, all of which may complicate 
the modelling. 

A possible approach to achieve the best of both worlds 
is to combine the weak lensing and galaxy clustering sur- 
veys: one can use the weak lensing to determine the galaxy 
bias and then use the 3-d galaxy clustering information to 
improve on the statistical errors. One way to do this is to
use galaxy-dark matter cross-correlation analysis from weak
lensing and combine it with the galaxy auto-correlation analysis.
If galaxies trace the dark matter perfectly then
it suffices to have a few well measured modes in both fields
to determine the galaxy bias. This has been proposed as a
way to get around the sampling variance in weak lensing
surveys (Pen 2004). In the absence of stochasticity it gives
the bias (and so the dark matter power spectrum) without 
the usual sampling variance errors, assuming the analysis 
is done on the same patch of the sky and with the correct 
radial weighting of the galaxies to match that of the dark 
matter. 

In both of these cases the underlying assumption is that 
there is no stochasticity between these fields on large scales. 
While there have been analytic attempts to address this assumption
(Matsubara 1999), it has not been tested well with
simulations in the past due to the lack of sufficient dynamic
range (but see ? for a related study). One must resolve the
halos small enough to be suitable as galaxy hosts (with typ- 
ical masses at or below 10^^ Mq). At the same time, the 
simulations must be large enough so that many long wave- 
length modes are sampled to determine the statistics of in- 
terest. We achieve this by using a set of new simulations with 
a larger dynamical range. The number of particles in these 
simulations, $10^8$-$10^9$, and their box size, 100-1000 $h^{-1}$Mpc,
allow a much better exploration of the halo bias and stochas- 
ticity on scales larger than available before. 

In addition to exploring the relation between halos and 
matter we can also investigate the relation between the ini- 
tial and final matter distribution. Weak lensing measures the 
nonlinear matter field, while for the study of linear growth 
factor one would like to know the relation between galaxies 
and linear matter field instead. We explore the relation be- 
tween the final and initial dark matter density field on large 
scales, where this relation is believed to be perfect. This case 
is amenable to perturbation theory analysis and as such al- 
lows one to interpret and verify the numerical simulation 
results. 

Finally, we revisit the question of halo bias as a function
of halo mass with the new simulations. This has been
addressed by the previous generation of simulations using $256^3$
particles and box sizes of order (100-140) $h^{-1}$Mpc (Jing 1998;
Sheth & Tormen 1999; Sheth et al. 2001). This, as noted by
the authors themselves, is barely adequate for this purpose
because of the large shot noise at high halo masses and the
insufficient number of large scale modes where linear evolution
is valid. The goal of these papers was to provide expressions
which fit over a range of power law simulations and
were not specifically optimized for realistic $\Lambda$CDM models.
They provide expressions that fit the simulations available
at the time to a reasonable accuracy, but which can be
systematically wrong by as much as 10-20%. To put things
into a current context, the statistical error on the amplitude
of galaxy clustering from SDSS using $k < 0.2\,h$/Mpc
modes is 1% (Tegmark et al. 2003), so a perfect bias determination,
for example using the faint galaxies as described
above, would allow us to reach this accuracy on the matter
power spectrum. In this era 20% accuracy no longer suffices,
and the goal of the present paper is to provide more accurate
expressions for halo bias as a function of mass and
cosmological model.



2 SIMULATIONS 

The N-body code we use in this paper is the Hashed Oct-tree
code (HOT), a parallel, tree-based code (Warren & Salmon
1993). This code was compared to a variety of other simulation
codes in Frenk et al. (1999), and further validation
studies will be presented elsewhere. For this paper we performed
several simulations with this code. The smallest was
a 96 $h^{-1}$Mpc box size, $512^3$ particle run (HOT1). This simulation
has a particle mass of $5.5\times10^{8}\,h^{-1}M_\odot$ and is useful
for probing the halo bias at the low mass end, below
10^^ ^~^Mq. It suffers from the small box size which makes 
the investigations on linear scales difficult and makes the 
shot noise fluctuations for higher mass halos (with lower 
halo numbers) very large. Next in size is a simulation with
a 288 $h^{-1}$Mpc box and $768^3$ particles (HOT2). This is the main
simulation that we use in this paper, as it has an optimal
combination of box size and particle mass for our purposes.
It samples the Fourier modes down to $k \sim 0.02\,h$/Mpc and
has many modes at $k \sim 0.1\,h$/Mpc, where the power spectrum
is still close to linear. This is also the typical scale
probed by current surveys such as SDSS and 2dF. The
particle mass for this simulation is $4.4\times10^{9}\,h^{-1}M_\odot$, so it
can resolve halos down to a few times $10^{11}\,h^{-1}M_\odot$, which
is sufficient for typical galaxies in a flux limited sample. Finally,
for determination of the halo bias at the high mass
end we use a simulation with a 1152 $h^{-1}$Mpc box and $768^3$
particles (HOT3). This simulation has a large enough box to
sample long wavelength modes well, but its particle mass of
$2.8\times10^{11}\,h^{-1}M_\odot$ does not allow us to resolve galactic size
halos and we limit its use to group and cluster size halos.
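The particle masses and the longest sampled wavelengths quoted above follow directly from the box size and the particle count. The short sketch below reproduces them, assuming equal-mass particles, $\Omega_m = 0.3$ and the standard critical density $\rho_{\rm crit} = 2.775\times10^{11}\,h^2 M_\odot\,{\rm Mpc}^{-3}$. The function names are illustrative only and are not part of the HOT code.

```python
import math

# Critical density today in h^2 M_sun / Mpc^3 (standard value, assumed here).
RHO_CRIT = 2.775e11


def particle_mass(box_size, n_side, omega_m=0.3):
    """Particle mass in h^-1 M_sun for a periodic box of side box_size
    (in h^-1 Mpc) filled with n_side**3 equal-mass particles."""
    return omega_m * RHO_CRIT * box_size**3 / n_side**3


def fundamental_mode(box_size):
    """Smallest nonzero wavenumber (in h/Mpc) that fits in the box."""
    return 2.0 * math.pi / box_size


# (box size in h^-1 Mpc, particles per side) for the three main runs.
runs = {"HOT1": (96.0, 512), "HOT2": (288.0, 768), "HOT3": (1152.0, 768)}

for name, (box, n_side) in runs.items():
    m_p = particle_mass(box, n_side)
    k_f = fundamental_mode(box)
    print(f"{name}: m_p = {m_p:.2e} h^-1 M_sun, k_f = {k_f:.4f} h/Mpc")
```

For the 288 $h^{-1}$Mpc box the fundamental mode is $2\pi/288 \simeq 0.022\,h$/Mpc, which is the "down to $\sim 0.02\,h$/Mpc" figure given in the text.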

The tree-code accuracy was controlled using the absolute
error criterion described in Salmon & Warren (1994),
which ranged from Altot / Ro per interaction at the start 
of the each simulation, to 10^^ Mtot/ Ro at the end. Plummer 
smoothing was used, with softening lengths of 7, 20 and 95 
comoving kpc for models 1, 2 and 3 respectively. The num- 
ber of timesteps for model 1 was 1475, 736 for model 2, and 
725 for model 3. Model 1 started at a redshift of 50, model 2 
at 44, and model 3 at 27. All particle masses were identical,
with the initial particle displacements imposed on a cubical
lattice.

All of these simulations are normalized to $\sigma_8 = 0.9$
and have $\Omega_m = 0.3$, $\Omega_b = 0.04$ and Hubble parameter
$h = 0.7$. They use realistic transfer functions from CMBFAST
(Seljak & Zaldarriaga 1996). $\sigma_8 = 0.9$ corresponds to
$\delta_H = 4.624\times10^{-5}$ normalization in CMBFAST. In the following
all the results are for HOT2 unless explicitly
specified otherwise.

For the purpose of studying bias as a function of halo
mass we also ran a suite of simulations varying one parameter
at a time. The box size for these simulations is
192 $h^{-1}$Mpc with $512^3$ particles. Their force and mass resolution
is the same as for HOT2. We ran the basic simulation
with the same parameters as for HOT1-3, as well as
simulations with $\Omega_m = 0.2$, $\sigma_8 = 0.8$, $n = 0.9$,
$dn/d\ln k = -0.04$ and $h = 0.6$,
6 simulations in total. We used the same seed in the random number
generator for all of these cases to minimize the sampling errors.
To investigate the bias at the low mass end we supplemented
these simulations with another run with $512^3$ particles in
a 96 $h^{-1}$Mpc box.

Finally, to investigate the bias at the high mass
end we used additional simulations with $384^3$ particles in
a 768 $h^{-1}$Mpc box, with standard parameters and $\sigma_8 = 0.90$,
$\sigma_8 = 0.775$ and $\sigma_8 = 1.046$. We also used another simulation
with a 700 $h^{-1}$Mpc box and $512^3$ particles, for which $\sigma_8 = 0.767$,
$\Omega_m = 0.27$ and $h = 0.71$. While this paper was undergoing the
refereeing process we finished another very large simulation with
a 768 $h^{-1}$Mpc box and $1024^3$ particles, again with the standard
parameters. We use this simulation to verify the results
obtained with other simulations, finding good agreement
among them.



3 STOCHASTICITY OF DARK MATTER 

We begin by exploring the relation between the final dark
matter density field and the initial density field, rescaled
to $z = 0$ using the linear growth factor. We Fourier transform
both fields and denote individual modes by $\delta_i(\mathbf{k})$ and
$\delta_f(\mathbf{k})$ (we treat real and imaginary components as separate
modes). Figure 1 shows the ratio $b(\mathbf{k}) = \delta_f(\mathbf{k})/\delta_i(\mathbf{k})$ for
$k < 0.1\,h$/Mpc. This is the scale at which one often assumes
linear theory to be valid. We see that there are significant
fluctuations between the initial and final field, suggesting
that there are large corrections to the linear evolution even
for $k < 0.1\,h$/Mpc.

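We can define the ratio of the power spectra as

$$\langle b^2(k)\rangle \equiv \frac{P_f(k)}{P_i(k)} = \frac{\langle\delta_f^2(\mathbf{k})\rangle}{\langle\delta_i^2(\mathbf{k})\rangle}, \tag{1}$$

where $\langle\,\rangle$ denotes ensemble average over different realizations
of the universe. We can define relative rms fluctuations in $b$
as

$$\frac{\sigma_b^2}{b^2} = \frac{\langle[\delta_f(\mathbf{k}) - b\,\delta_i(\mathbf{k})]^2\rangle}{b^2\,\langle\delta_i^2(\mathbf{k})\rangle}, \qquad b \equiv \langle b^2(k)\rangle^{1/2}. \tag{2}$$

This is related to the cross-correlation coefficient $r$, defined
as

$$r(k) = \frac{\langle\delta_i(\mathbf{k})\,\delta_f(\mathbf{k})\rangle}{\sqrt{\langle\delta_i(\mathbf{k})\,\delta_i(\mathbf{k})\rangle\,\langle\delta_f(\mathbf{k})\,\delta_f(\mathbf{k})\rangle}}. \tag{3}$$

The two are related via

$$\frac{\sigma_b}{b} = [2(1-r)]^{1/2}. \tag{4}$$

Figure 1. Ratio of final to initial density perturbations as a function
of wavemode amplitude $k$ [$h$/Mpc]. There is a large scatter between
the two quantities even on large scales, where linear theory is
usually assumed to be valid.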

Figure 2 shows $\sigma_b/b$ as a function of wavemode $k$, where
the average has been done over a large number of wavemodes
so that $r$ converges (at very low $k$ this condition is
not satisfied and $r$ is biased high, which underestimates the
rms fluctuations). We see that $\sigma_b/b$ changes from 10% at
$k \sim 0.02\,h$/Mpc to 40% at $0.1\,h$/Mpc, above which it rapidly
increases and the two fields become incoherent. Figure 2 also
shows the ratio of nonlinear to linear power spectrum at
$z = 0$, $\langle b^2(k)\rangle$. For $k > 0.15\,h$/Mpc the final power spectrum
rapidly grows with $k$ and exceeds the linear power spectrum,
while for $k < 0.15\,h$/Mpc the final spectrum is slightly antibiased
on large scales, i.e. $P_f(k) < P_i(k)$. We note that this
effect is larger when small boxes are used. We discuss this
further below.
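The measurement described above can be sketched in a few lines. The example assumes that the mode amplitudes of the initial and final fields in a given $k$ bin are already available as two equal-length lists (real and imaginary parts counted as separate modes), and it replaces the ensemble averages by averages over the modes in the bin. The function name `bias_statistics` and the toy data are hypothetical and serve only as illustration.

```python
import math
import random


def bias_statistics(delta_i, delta_f):
    """Return (b, r, sigma_b_over_b) from paired mode amplitudes.

    delta_i, delta_f: sequences of real numbers, one entry per mode.
    Averages over the supplied modes stand in for ensemble averages.
    """
    n = len(delta_i)
    p_ii = sum(x * x for x in delta_i) / n                    # initial power
    p_ff = sum(y * y for y in delta_f) / n                    # final power
    p_if = sum(x * y for x, y in zip(delta_i, delta_f)) / n   # cross power
    b = math.sqrt(p_ff / p_ii)              # rms bias, b^2 = P_f / P_i
    r = p_if / math.sqrt(p_ii * p_ff)       # cross-correlation coefficient
    # Clamp tiny negative round-off so the square root stays real.
    sigma_b_over_b = math.sqrt(max(0.0, 2.0 * (1.0 - r)))
    return b, r, sigma_b_over_b


# Toy data (hypothetical): final = 0.98 * initial + uncorrelated noise.
rng = random.Random(1)
delta_i = [rng.gauss(0.0, 1.0) for _ in range(20000)]
delta_f = [0.98 * x + rng.gauss(0.0, 0.1) for x in delta_i]

b, r, scatter = bias_statistics(delta_i, delta_f)
print(f"b = {b:.3f}, r = {r:.4f}, sigma_b/b = {scatter:.3f}")
```

With too few modes in a bin the estimate of $r$ does not converge, which is the low-$k$ caveat noted in the text.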

The main result arising from figures 1 and 2 is that the fluctuations
between the linear and nonlinear fields are large on
large scales, despite the fact that the nonlinear power spectrum
is very close to the linear one. This is not so evident
from the cross-correlation coefficient $r$ itself, which can be
close to 1 and still lead to large rms fluctuations: even for
$r = 0.995$ the rms fluctuations between initial and final field
are 10% for any given mode.
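The 10% figure for $r = 0.995$ can be checked with a short derivation. It is a sketch that assumes $\sigma_b$ is defined as the rms residual of $\delta_f$ about $b\,\delta_i$ in units of the rms of $\delta_i$, with $b^2 = \langle\delta_f^2\rangle/\langle\delta_i^2\rangle$, so that $\langle\delta_i\delta_f\rangle = r\,b\,\langle\delta_i^2\rangle$.

```latex
\begin{align}
\sigma_b^2\,\langle\delta_i^2\rangle
  &\equiv \langle(\delta_f - b\,\delta_i)^2\rangle
   = \langle\delta_f^2\rangle - 2b\,\langle\delta_i\delta_f\rangle
     + b^2\langle\delta_i^2\rangle \\
  &= 2b^2\langle\delta_i^2\rangle - 2b^2 r\,\langle\delta_i^2\rangle
   = 2b^2\langle\delta_i^2\rangle\,(1-r), \\
\frac{\sigma_b}{b} &= [2(1-r)]^{1/2},
  \qquad
  \left.\frac{\sigma_b}{b}\right|_{r=0.995} = \sqrt{0.01} = 0.1 .
\end{align}
```

Because the scatter depends on $1-r$ through a square root, a correlation coefficient that looks very close to unity still corresponds to a sizeable mode-by-mode scatter.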

One can get some understanding of these results
by using second order perturbation theory results (see
Bernardeau et al. 2002 for a recent review). To compute
the power spectrum to second order one must derive the
density field to 3rd order, $\delta = \delta_1 + \delta_2 + \delta_3$. Second order
contributions to the power spectrum arise both from $\delta_2\delta_2$
and $\delta_1\delta_3$ terms. The $\delta_2\delta_2$ term is the mode-mode coupling term,
while the $\delta_1\delta_3$ term is the nonlinear growth evolution term. These
terms have different behaviour in various limits and have
differing signs in the contribution: while $\delta_2\delta_2$ is strictly positive,
$\delta_1\delta_3$ has a negative component. For $k < 0.15\,h$/Mpc
the negative contribution wins and the second order correction
to the power spectrum is negative. At its peak around
$k \sim 0.1\,h$/Mpc the correction is 5% and slowly decreases towards
$k \to 0$. While the correction is small on average, this
is a result of a cancellation between positive and negative
contributions. The dominant perturbative corrections come
from the mode-mode couplings at wavelengths close to the
wavelength of the mode itself: for $k = 0.1$/Mpc the dominant
contribution to the positive component is from the
modes around $k \sim 0.05$/Mpc, which contribute around a
third of the total correction or around 5% of the final power
spectrum (Jain & Bertschinger 1994). These are long wavelength
modes and in any finite volume there will be large
statistical fluctuations in their power relative to the true
power. This leads to significant fluctuations in the second
order corrections depending on the actual realization of the
mode amplitudes. Thus the final amplitudes of individual
modes fluctuate significantly relative to their initial values
because of perturbative large scale effects.

Figure 2. Relative rms fluctuation $\sigma_b/b = [2(1-r)]^{1/2}$ between
the final and the initial density field for HOT3 (solid). Also shown
is the power spectrum ratio between the two fields (dashed). Results
for HOT2 are similar, but show a somewhat larger suppression
of nonlinear versus linear power spectrum for $k < 0.1\,h$/Mpc.

A similar effect is observed in the average nonlinear
power spectrum, which is suppressed relative to the linear one.
The suppression of nonlinear power for $k < 0.15\,h$/Mpc relative
to the linear one is dominated by long wavelength mode-mode
coupling, so to reach percent precision in simulations
one needs very large simulation boxes. We find
that the difference between HOT2 (320 $h^{-1}$Mpc) and HOT3
(1152 $h^{-1}$Mpc) is 5% in power at $k = 0.1\,h$/Mpc, with the larger
HOT3 simulation being closer to the linear power spectrum
than HOT2. Thus while these mode-mode induced fluctuations
are small compared to sampling variance for individual
modes, they are not small when one averages over many
modes and may dominate the accuracy of amplitude determination
on large scales.



4 STOCHASTICITY OF HALOS 

Galaxies are believed to form inside dark matter halos, 
which are virialized structures of high density. They can 
be labelled by their virial mass. Observations suggest that 
about 80% of the galaxies in a typical flux limited survey 
form at the centers of halos with masses ranging between 
10"/i"^Mq to 10^^/i"^A'/q, while the remaining 20% of 
galaxies are non-central and occupy groups and clusters
(Guzik & Seljak 2002; ?). The exact radial distribution of
galaxies inside halos and the form of the halo mass proba- 
bility distribution is the subject of a lot of current observa- 
tional and theoretical effort. Here we will use centers of dark 
matter halos as a proxy for galaxy positions. This will not 
give the correct correlation properties on small scales, where 
correlations between central and noncentral galaxies within 
the halos are important, but should be valid on large scales, 
where halos can be thought of as pointlike. We will show 
the results for a range of halo masses, which can be roughly
thought of as corresponding to galaxies with different luminosities,
since there is a tight relation between the halo mass
and luminosity (McKay et al. 2001; Guzik & Seljak 2002;
?). Alternatively, the different samples can be thought of as
varying the flux limit of a survey: going to fainter limits
increases the number density of galaxies and thus reduces
the shot noise, and the same effect is achieved by going to
fainter galaxies.

Dark matter halos are identified from the simulations
using the standard friends-of-friends algorithm with a linking
length of 0.2. The resulting mass functions agree well
with the fitting formulae in the literature (Sheth & Tormen
1999; Jenkins et al. 2001). We order the halos by mass and use
subsamples separated roughly by a factor of 2 in mass. As in
the previous section we can define the halo fluctuation $\delta_h(\mathbf{k})$ and
bias $b(\mathbf{k}) = \delta_h(\mathbf{k})/\delta_m(\mathbf{k})$, as well as the cross-correlation
coefficient between the two fields (equation 3). Figure 3 shows
the relative rms fluctuations $\sigma_b/b$ as a function of scale for
several halo masses, relative to both the initial and the final
density field. We show the case with and without the subtraction
of the shot noise contribution to the halo power spectrum
(the dark matter power spectrum does not require shot
noise subtraction because of the large number of dark matter
particles). The lines without shot noise subtraction are always
above the ones with subtraction and are the relevant
ones if one is interested in the stochasticity between the
halos and dark matter. The lower lines, for which the shot
noise has been subtracted, show the remaining stochasticity
which is not due to the shot noise. Because of the shot noise
subtraction the cross-correlation coefficient can exceed 1, in
which case we do not show the result.
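The effect of the shot noise subtraction on the cross-correlation coefficient can be illustrated as follows. This is a minimal sketch, assuming Poisson shot noise so that the correction to the halo power spectrum is $1/\bar{n}$, with $\bar{n}$ the halo number density. The function name and the example numbers are hypothetical and are not taken from the simulations.

```python
import math


def halo_matter_scatter(p_hh, p_mm, p_hm, number_density,
                        subtract_shot_noise=True):
    """Return (r, sigma_b_over_b), or None when the result is not usable.

    p_hh, p_mm, p_hm: halo auto, matter auto and halo-matter cross power
    at one wavenumber, in (Mpc/h)^3.
    number_density: halo number density in (h/Mpc)^3. Poisson shot noise
    is assumed, so the correction to p_hh is 1 / number_density.
    """
    if subtract_shot_noise:
        p_hh = p_hh - 1.0 / number_density
    if p_hh <= 0.0:
        return None  # noise-dominated mode: nothing left after subtraction
    r = p_hm / math.sqrt(p_hh * p_mm)
    if r > 1.0:
        return None  # r can exceed 1 after subtraction; such points are dropped
    return r, math.sqrt(2.0 * (1.0 - r))


# Hypothetical numbers: intrinsic r = 0.99 plus shot noise of 1000 (Mpc/h)^3.
raw = halo_matter_scatter(11000.0, 10000.0, 9900.0, 1e-3,
                          subtract_shot_noise=False)
corrected = halo_matter_scatter(11000.0, 10000.0, 9900.0, 1e-3)
print("without subtraction:", raw)
print("with subtraction:   ", corrected)
```

The uncorrected scatter is always the larger of the two, which matches the ordering of the curves described in the text.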

From figure 3 we see that the halos are even less well
correlated with the initial density field than the dark matter is.
The stochasticity begins at the level of 20% at low $k$, increasing
to 50% at $k \sim 0.1\,h$/Mpc. The shot noise contribution
to stochasticity is small for halos with high spatial density
(low mass halos), but increases significantly for halos with
Figure 3. Relative rms fluctuation $\sigma_b/b = \sqrt{2(1-r)}$ between
the halo density field and the initial (solid) or final (dashed)
matter density field. Lower curves have been obtained by ap- 
plying the shot noise subtraction from the halo power spec- 
trum. Average masses are 4.5 X 10^^ h~^MQ (a), 10^^h~^MQ 
(b), 2 X lOi2?t-iM0 (c) and IO^/j-IMq (d). The correspond- 
ing halo densities are 7 X lO'^ft^/Mpc^, 2.7 X 10~^h^/Mpc^, 
1.5 X and 3.5 x IQ-^h^/Mpc^. 



low spatial density, as expected. There is no obvious differ- 
ence in the shot noise subtracted values, suggesting that the 
shot noise simply adds an additional component of stochas- 
ticity on top of that induced by nonlinearities in the relation 
between the halos and the initial density field. 

The correlation coefficient between the halos and the final
density field is also shown in figure 3 (dashed lines). Compared
to the halo-initial field correlations the stochasticity
is similar on the largest scales, but there is a better agreement
between the halo and the final dark matter field on
smaller scales ($k > 0.1\,h$/Mpc). The cross-correlation coefficient
$r$ would likely be even larger on small scales if we had
modelled the galaxy distributions within the halos more realistically,
since this would lead to an enhancement of correlations
on small scales, similar to that seen in the dark matter.
Results from GIF simulations and analytic results using
halo models suggest that the cross-correlation coefficient
can remain close to unity up to a fairly high $k \sim 1\,h$/Mpc
(Seljak 2000), but this may not be generic and depends on
the details of how galaxies are populated within the halos,
which are quite uncertain. Observational evidence suggests
that there is some stochasticity on 1 Mpc scale, with $r \sim 0.5$
(Hoekstra et al. 2002). If $r < 1$ it would complicate the
interpretation of the results based on the comparison between
galaxy-galaxy correlations and galaxy-dark matter correlations,
such as those from the galaxy-galaxy lensing analysis
(Sheldon et al. 2003). Here we are more concerned with the
correlations on large scales, $k < 0.1\,h$/Mpc, where the details
of galaxy distribution within halos are not important
and where direct observations are not yet available. The results
suggest that the fluctuations between halos and initial
or final matter field are never below 10-20%.

The dark matter distribution cannot be directly observed,
so results shown in figure 3 are not directly applicable
to any observational test. The closest example to a direct
observation of the dark matter is through the weak lensing 
effect. Here the light from distant galactic sources is being 
distorted by the mass distribution along the line of sight. 
By averaging over the image distortions we can reconstruct 
the 2-d shear and convergence maps. These are given by the 
line of sight projection of the matter density, weighted by a 
radial function that is very broad. Correlations at a given an- 
gular scale receive dominant contributions from a transverse 
distance at half the distance to the source, but significant 
contributions are also coming from much smaller transverse 
separations produced by the mass distribution closer to the 
observer. 

It has been suggested by Pen (2004) that if one cross-correlates
the properly radially weighted galaxy field with
the weak lensing maps then one determines the bias of galax- 
ies exactly if the two are perfectly correlated. Under these 
assumptions one can use the galaxy clustering information 
to determine the amplitude of dark matter fluctuations with 
higher accuracy than from the weak lensing itself, because 
the galaxy clustering can be done in 3-d (if redshifts are 
measured) and so one has more independent modes to re- 
duce the sampling variance compared to the 2-d analysis. For 
this method to work the correlation between the projected 
matter density and galaxy field must be close to perfect. 

To address this assumption one must correlate 2-d projections
of final dark matter and galaxies. While properly
projected weak lensing 2-d maps have been constructed from
N-body simulations (Jain et al. 2000; White & Hu 2000), we
take a simplifying approach here and cross-correlate the 2-d
projections of the simulations along each of the 3 axes. The
resulting rms scatter as a function of projected wavevector
$k$ is found to be significantly larger than in the 3-d case,
a consequence of the projection effects, which cause shorter
wavelength modes to contribute to longer wavelength modes
in projection. For 10^^ h^^ AIq halos, which corresponds 
roughly to $L_*$ galaxies, we find that the rms scatter is 20% at
$k \sim 0.03\,h$/Mpc and 40% at $k \sim 0.1\,h$/Mpc. This is reduced
by a factor of 2 if 10^^ Mq galaxies are used instead. This 
last example is shown in figure 4. In reality the stochasticity
is likely to be larger for the lensing case, since projections at 
a fixed angle (rather than at a fixed transverse separation 
as done here) receive contributions from nearby small scale 
structures, for which the stochasticity between the galaxies 
and the dark matter will be much larger. 

We can estimate the effect of this scatter on the amplitude
determination from the weak lensing cross-correlation
analysis. If the lensing kernel peaks at $z \sim 0.3-0.4$ then
$k \sim 0.1\,h$/Mpc corresponds to $l \sim 100$. In a 200 square
degree survey such as the upcoming CFHT Legacy Survey
(Van Waerbeke & Mellier 2003) we will have about 5 independent
modes at $l \sim 30$ and 50 at $l \sim 100$. This means
that for galaxies in 10^^ $h^{-1}M_\odot$ halos the overall linear amplitude
will have an error of $20\%/\sqrt{5} \sim 9\%$ at $l \sim 30$ and 6%
at $l \sim 100$, arising just from this effect (the power spectrum
amplitude error will be twice as large). Additional errors of
comparable magnitude will arise from the lensing noise and
Figure 4. Ratio of 2-d projected halo density perturbations
(M = IO^/i-^Mq) to the initial density field as a function of 
wavemode amplitude $k$. The projections are along each of the
three axes (288 $h^{-1}$Mpc for the HOT2 simulation used here). The
scatter is larger than the corresponding 3-d case in figure 1^ 

projection effects. Such a poor determination of the growth
factor as a function of redshift is unlikely to improve our current
constraints on the dark energy significantly. This source
of error was not included in the previous analysis
and is much larger than the forecast errors without it.
This complicates the prospects of this method for studies 
of dark energy through the growth factor evolution. The er- 
rors can be reduced with a larger survey area: for a survey 
covering 25% of full sky the errors on the power spectrum 
amplitude may approach 1% because more modes are being 
sampled and because the largest modes have the smallest 
amount of stochasticity. It remains to be seen whether this 
is ever competitive with a straight weak lensing analysis on 
smaller scales. 
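The error estimate quoted above is the per-mode scatter divided by the square root of the number of independent modes. The sketch below reproduces the two cases from the text (20% scatter with 5 modes at $l \sim 30$, 40% scatter with 50 modes at $l \sim 100$). It assumes the modes are independent and have equal scatter, and the function name is illustrative.

```python
import math


def amplitude_error(rms_scatter, n_modes):
    """Fractional error on the linear amplitude from averaging n_modes
    independent modes, each with the given fractional rms scatter."""
    return rms_scatter / math.sqrt(n_modes)


# (multipole, per-mode rms scatter, number of independent modes)
cases = [(30, 0.20, 5), (100, 0.40, 50)]

for ell, scatter, n_modes in cases:
    err = amplitude_error(scatter, n_modes)
    # The power spectrum scales as amplitude squared, so its error doubles.
    print(f"l ~ {ell}: amplitude error {err:.1%}, power error {2 * err:.1%}")
```

Since the error falls only as the inverse square root of the number of modes, a substantially larger survey area is needed to bring it down to the percent level.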

As discussed in the introduction another method to de- 
termine the bias is to combine the clustering analysis of faint 
galaxies, for which we know the theoretical bias, with the lu- 
minous galaxies, for which we can measure the clustering on 
large scales with a small statistical error. To determine the 
relative bias between the populations we can simply compare
the smoothed maps. In the absence of stochasticity between
the two galaxy populations one could determine the
amplitude of mass fluctuations directly. Suppose that we 
want to determine the clustering amplitude of faint galax- 
ies, which are in low mass halos (around W^^ A4q for 
galaxies 2-3 magnitudes below $L_*$) and $L_*$ galaxies, which
typically occupy 10^^ $M_\odot$. Figure 5 shows the relative
rms fluctuations in ratios of Fourier mode amplitudes between
halos of mass $10^{11}\,h^{-1}M_\odot$ and 10^^ $h^{-1}M_\odot$. We see
again from figure 5 that the scatter is large. Both shot noise
and stochasticity due to nonlinearities limit this method.
The rms fluctuations between $10^{11}\,h^{-1}M_\odot$ and 10^^ $h^{-1}M_\odot$


Figure 5. Ratio of halo density perturbations $\delta_{h1}/\delta_{h2}$ as a function
of wavemode amplitude $k$. The halos are of mass 10^^ $h^{-1}M_\odot$
($h_1$) and 10^^ $h^{-1}M_\odot$ ($h_2$).

halos are 8% at $k \sim 0.1\,h$/Mpc and 23% at $k \sim 0.2\,h$/Mpc.
This is somewhat smaller than between the halos and the 
matter. Moreover, galaxies in redshift surveys provide 3-d 
information, so there are more large scale modes to reduce 
the scatter. Nevertheless, any attempt to determine the lin- 
ear bias using the cross-correlations must include this source 
of stochasticity in the analysis. 



5 HALO BIAS AS A FUNCTION OF MASS 

One of the important questions that can be addressed with
these simulations is the relation between halo and dark matter
power spectrum as a function of halo mass. This relation
has been theoretically predicted from the spherical collapse
model (Cole & Kaiser 1989; Mo & White 1996) and from
the ellipsoidal collapse model (Sheth et al. 2001), which suggest
that the halo bias is related to a derivative of the halo
mass function. The relation has also been extracted from
numerical simulations, with a good quantitative agreement
between the theoretical predictions and the simulations
over a range of different cosmological models (Jing
1998; Sheth & Tormen 1999). Since these comparisons were
done for a wide range of initial power spectra they were
not specifically designed for realistic $\Lambda$CDM models and
the predictions and simulations could differ by up to 20%.
Moreover, the simulations used in previous work were based
on $256^3$ particle simulations and did not have sufficient dynamic
range to sample the long wavelength modes and resolve
small halos at the same time. For the more massive
halos shot noise is large, so the bias estimate is noisy. For
halos close to the resolution threshold (typically of the order
of 50-100 particles) some fraction of the halos may be
missed by the halo finder, leading to biased results in the
bias determination at the low mass end.

Figure 6. Ratio of halo to linear density field power spectrum
as a function of wavevector $k$ for halos of varying mass. At the
bottom are the halos from the HOT1 simulation, next up are those
from HOT2 and at the top are the HOT3 halos.

The simulations used in this paper are a significant improvement
over the previous generation. They contain 8-64
times more particles and cover a wide range of masses.
We use the HOT1 simulation for halos in the mass range
$(5\times10^{10} - 3\times10^{11})M_\odot$, HOT2 for halos in the mass
range $(3\times10^{11} - 10^{13})M_\odot$ and HOT3 for halos in the mass
range $(10^{13} - 10^{15})M_\odot$ (the latter two are also checked with
a $768h^{-1}$Mpc box, $1024^3$ particle simulation).

Figure 6 shows the ratio of the (shot noise corrected)
halo power spectrum to the linear mass power spectrum as a
function of wavevector $k$. One can see that the assumption
of constant bias is reasonable for $k < 0.1h$/Mpc and even
beyond, so a linear bias can be defined as an appropriate average
over these modes. The exceptions are the most massive
halos in HOT3 with $b > 1.5$, for which the power spectrum
is suppressed already at $k \sim 0.1h$/Mpc due to the fact that
the FOF halos do not overlap and so cannot be closer than
two times the virial radius. Here we use all of the modes
with $k < 0.1h$/Mpc, except for the smallest $96h^{-1}$Mpc simulation
where we use $k < 0.15h$/Mpc. We note that there
is a good agreement between the simulations in the overlap 
mass range, but the larger simulation has smaller statistical 
errors. The smallest simulation (HOT1) has very few modes
in the linear regime and the fluctuations in the ratio caused 
by perturbative effects beyond linear theory are large, so the 
bias determination from this simulation is somewhat less re- 
liable. On the other hand, all of the low mass halos in this 
simulation have almost the same bias and at the upper end 
of the mass range there is a good agreement in bias with 
halos of the same mass from HOT2. 
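
As a rough illustration of how a linear bias could be extracted from such a ratio, the following Python sketch averages the shot noise corrected ratio of halo to linear power over the modes with $k < 0.1h$/Mpc and takes the square root. The function name, the unweighted average and the Poisson form $1/n$ for the shot noise are assumptions made for this example; the text above only asks for "an appropriate average" over these modes.

```python
import math


def linear_bias(k, p_halo, p_lin, n_halo, k_max=0.1):
    """Estimate a scale-independent bias from modes with k < k_max.

    k       : wavenumbers in h/Mpc
    p_halo  : measured halo power spectrum at each k, in (Mpc/h)^3
    p_lin   : linear matter power spectrum at each k, same units
    n_halo  : halo number density in (h/Mpc)^3; the shot noise is
              assumed to be Poisson, i.e. 1 / n_halo
    """
    shot_noise = 1.0 / n_halo
    ratios = [
        (ph - shot_noise) / pl
        for ki, ph, pl in zip(k, p_halo, p_lin)
        if ki < k_max
    ]
    if not ratios:
        raise ValueError("no modes below k_max")
    # b^2 is taken as the unweighted mean of (P_hh - shot noise) / P_lin.
    b_squared = sum(ratios) / len(ratios)
    return math.sqrt(b_squared)


# Toy input constructed as P_hh = b^2 P_lin + 1/n with b = 0.7.
k = [0.02, 0.04, 0.06, 0.08, 0.12]
p_lin = [2.0e4, 2.4e4, 2.0e4, 1.6e4, 1.0e4]
n_halo = 1.0e-3
p_halo = [0.49 * pl + 1.0 / n_halo for pl in p_lin]
b = linear_bias(k, p_halo, p_lin, n_halo)
print(f"recovered bias: {b:.3f}")
```

With noisy measurements the mean ratio can become negative for weakly clustered or sparse samples, in which case the square root fails; a weighted average by the number of modes per bin would be a natural refinement.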

Figure 7. Bias as a function of mass in units of the nonlinear
mass. Points are from $96h^{-1}$Mpc $512^3$ (HOT1, green),
$144h^{-1}$Mpc $512^3$ (cyan), $192h^{-1}$Mpc $512^3$ (red), $288h^{-1}$Mpc
$768^3$ (HOT2, blue), $1152h^{-1}$Mpc $768^3$ (HOT3, magenta) and
$768h^{-1}$Mpc $1024^3$ (black) simulations. Note that in several cases
the points from two simulations overlap exactly. Upper (dashed
blue) line is the theoretical prediction from Sheth and Tormen (1999).
Lower (solid black) line is the expression from equation 5.

To simplify comparison with theoretical predictions we will
scale all the masses relative to the nonlinear mass $M_{\rm nl}$,
defined as the mass within a sphere for which the rms fluctuation
amplitude of the linear field is 1.68. While the
theoretical predictions for the bias depend on the cosmo- 
logical model, most of that dependence is accounted for 
if the mass is expressed in terms of the nonlinear mass. 
For the HOT1-3 simulations with $\sigma_8 = 0.9$ and $\Omega_m = 0.3$ at
$z = 0$ the nonlinear mass defined with 1.68 overdensity is
$8.73\times10^{12}M_\odot$. Figure 7 shows the bias determinations
as a function of halo mass from the simulations used in this
paper. The dashed line is the theoretical prediction from
Sheth & Tormen (1999) (the fitting formula given in Jing
(1998) is very similar; while these fitting formulae are not
very accurate we find a good agreement between the simulation
results in these papers and our simulations). We see
that these theoretical predictions overestimate the bias below
$M_{\rm nl}$ and are a good fit above $M_{\rm nl}$. The largest discrepancy
is below $M_{\rm nl}$, where the relative error can be up to
20%. The various simulations are in a reasonable agreement 
among themselves and the scatter between the points at the 
same mass is mostly due to the shot noise and small volume 
over which one is averaging. There may be some systematic
error due to the fact that the nonlinear mass computed from
the theoretical power spectrum can differ from the value obtained
if one uses the actual realization. We find this can
change the nonlinear mass by up to 10% and would cause
a horizontal shift by this amount. This is of almost no consequence
for masses below $M_{\rm nl}$, where bias is only weakly
dependent on the mass, but may lead to a larger error at
the high mass end.
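
The definition of the nonlinear mass used above can be made concrete with a short numerical sketch. The Python code below integrates a linear power spectrum against a spherical top-hat window to obtain the rms fluctuation $\sigma(R)$, finds the radius at which $\sigma = 1.68$ by bisection, and converts it to a mass using the mean matter density. All function names are hypothetical, and the power-law spectrum is only a placeholder to make the example runnable: it is not the $\Lambda$CDM spectrum of the simulations, so the number it prints is not the value quoted in the text.

```python
import math

RHO_CRIT = 2.775e11  # critical density in h^2 Msun / Mpc^3


def tophat_window(x):
    """Fourier transform of a spherical top-hat, as a function of x = k R."""
    if x < 1e-3:
        return 1.0 - x * x / 10.0  # leading terms of the series near x = 0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3


def sigma(radius, power, k_min=1e-4, k_max=1e2, n_steps=4000):
    """rms linear fluctuation in spheres of the given radius (Mpc/h).

    power(k) is the linear power spectrum in (Mpc/h)^3 with k in h/Mpc.
    The integral over ln k uses the trapezoid rule.
    """
    dlnk = math.log(k_max / k_min) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        k = k_min * math.exp(i * dlnk)
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * k**3 * power(k) * tophat_window(k * radius) ** 2
    return math.sqrt(total * dlnk / (2.0 * math.pi**2))


def nonlinear_mass(power, omega_m, threshold=1.68, r_lo=1e-2, r_hi=1e2):
    """Return (mass in Msun/h, radius in Mpc/h) where sigma = threshold."""
    if not sigma(r_lo, power) > threshold > sigma(r_hi, power):
        raise ValueError("threshold is not bracketed by r_lo and r_hi")
    for _ in range(60):  # bisection in ln R, since sigma falls with R
        r_mid = math.sqrt(r_lo * r_hi)
        if sigma(r_mid, power) > threshold:
            r_lo = r_mid
        else:
            r_hi = r_mid
    radius = math.sqrt(r_lo * r_hi)
    mass = 4.0 / 3.0 * math.pi * RHO_CRIT * omega_m * radius**3
    return mass, radius


def toy_power(k):
    """Placeholder power-law spectrum, NOT the LCDM spectrum of the paper."""
    return 200.0 / k**2


m_nl, r_nl = nonlinear_mass(toy_power, omega_m=0.3)
print(f"R_nl = {r_nl:.2f} Mpc/h, M_nl = {m_nl:.2e} Msun/h")
```

Replacing `toy_power` with a tabulated or fitted linear spectrum for a given realization would show directly how much the nonlinear mass shifts relative to the theoretical value.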

We find that the unbiased galaxies with $b = 1$ are at
$M = 1.5M_{\rm nl}$ and the bias is rapidly changing above $0.1M_{\rm nl}$,
while below this it is essentially constant with the value 
around 0.68. In all simulations we see bias increasing at the 
lowest masses (figure 8), which is a numerical artifact. For
example, such an increase is seen in HOT2 at the low mass
end and is not confirmed in HOT1, where the mass resolution
improves by a factor of 8 (figure 8). Moreover, this
increase at low mass end changes into a decrease if we re- 
move unbound particles from the halos. To be safe we only 
present results where the difference between the two cases is 
less than 0.01. Note that in HOT1 we find $b \sim 0.65$ at the low
mass end. Even in the region of overlap with HOT2 the bias
in HOT1 is systematically lower by 0.03. This is likely to be
due to the sampling variance in HOT1, as can be seen from
figure 6, which shows considerable fluctuations as a function
of wavevector for this simulation. With a $144h^{-1}$Mpc box
$512^3$ particle simulation we again find that $b \sim 0.65 - 0.68$ at
the low mass end and that there is indeed significant scatter
due to small box size (figure 7). For this reason the empirical
fit given below goes above HOT1 at the low mass end.
It is not entirely clear that this is the correct procedure, as
HOT2 could have been already affected by the resolution,
but the fact that halos with and without unbound particles
give the same result argues against this.

While there is some uncertainty in the bias value at the
low mass end, all simulations agree very well around the nonlinear
mass. In addition to the HOT2 simulation we also have
a $512^3$ particle simulation with a $192h^{-1}$Mpc box and
a $1024^3$ particle simulation with a $768h^{-1}$Mpc box, both of
which sample this regime well. At the high mass end uncertainty
increases again because of the small number of high mass halos.
In addition to HOT3 and the $768h^{-1}$Mpc box, $1024^3$
particle simulation we use another $768h^{-1}$Mpc simulation
with $384^3$ particles.

The solid curve in figure 7 is an empirical expression
that fits all simulations. Over the range $10^{-3} <
M/M_{\rm nl} < 10^{2}$ it is given by

$$b_0(x = M/M_{\rm nl}) = 0.53 + 0.39x^{0.45} + \frac{0.13}{40x + 1} + 5\times10^{-4}x^{1.5}. \qquad (5)$$

This expression should be accurate to about 3% or better
for this model, as suggested from the scatter in figure 7.
In figure 8 we show the bias as a function of mass for
several simulations for which we varied one parameter at a
time, roughly spanning the range of interest from cosmological
constraints today. We see that there is very little
difference in the theoretical predictions for halo bias as a
function of $M/M_{\rm nl}$, suggesting that instead of deriving full
expressions, which depend on all cosmological parameters,
one can simply use a single relation with mass in units of
nonlinear mass, as in equation 5. The deviations from this relation
are qualitatively consistent with the predictions given
by Sheth & Tormen (1999). We can generalize the results
from equation 5 by linearizing the bias relation in terms of
cosmological parameters,

$$b(x) = b_0(x) + \log_{10}(x)\,[0.4(\Omega_m - 0.3 + n_s - 1) + 0.3(\sigma_8 - 0.9 + h - 0.7) + 0.8\alpha_s]. \qquad (6)$$

Figure 8. Bias as a function of mass in units of the nonlinear
mass for several cosmological models. We varied one parameter 
at a time relative to the fiducial concordance model, roughly cov- 
ering the range of interest. This figure shows that the bias predic- 
tions depend predominantly on the nonlinear mass, while other 
cosmological parameters play only a minor, but not entirely neg- 
ligible, role. 

This correction should be reasonable for $1 > x > 0.1$,
while below that the correction appears to saturate at
$-0.4[\Omega_m - 0.3 + n_s - 1] - 0.3[\sigma_8 - 0.9 + h - 0.7] - 0.8\alpha_s$. For
massive halos with $M > M_{\rm nl}$ ($x > 1$) the differences among
models in the bias predictions from Sheth & Tormen (1999)
become larger, but this is difficult to observe in these simulations,
where the number of such halos is small and the bias
measurements have large shot noise. In this regime the analytic
predictions may be more accurate than equation 6: we
do not see much evidence against the analytic expressions
from our comparisons (figure 7) and analytic expressions
can be more easily generalized to other cosmological
models. Note however that the simulations used in this paper
improve upon the previous generation simulations over
this regime as well.
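
The cosmology dependent term of equation 6, including the saturation below $x = 0.1$ described above, can be sketched in Python as follows. The function name and argument names are hypothetical; the defaults are the fiducial values appearing in equation 6, with no running ($\alpha_s = 0$) assumed for the fiducial model. The sketch refuses $x > 1$, where the text suggests that the analytic predictions may be preferable.

```python
import math


def bias_correction(x, omega_m=0.3, n_s=1.0, sigma_8=0.9, h=0.7, alpha_s=0.0):
    """Additive correction to b0(x) from equation 6, for x = M/M_nl <= 1.

    Below x = 0.1 the correction is held at its x = 0.1 value, following
    the saturation described in the text.
    """
    if not 0.0 < x <= 1.0:
        raise ValueError("correction is only quoted for 0 < M/M_nl <= 1")
    slope = (0.4 * (omega_m - 0.3 + n_s - 1.0)
             + 0.3 * (sigma_8 - 0.9 + h - 0.7)
             + 0.8 * alpha_s)
    return math.log10(max(x, 0.1)) * slope


# Example: shift in bias at M = 0.3 M_nl when sigma_8 is raised to 1.0.
print(f"{bias_correction(0.3, sigma_8=1.0):+.4f}")
```

The returned value is meant to be added to $b_0(x)$ from equation 5 evaluated at the same $x$, with $M_{\rm nl}$ computed for the cosmological model in question.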



6 CONCLUSIONS 

In this paper we have addressed the relations between the 
matter density field, halos and initial density field, focusing 
on large scales where these are often assumed to be propor- 
tional to each other. We focus on two issues. First, what is 
the scatter between these fields around the average relation? 
This is expressed here in terms of relative scatter between 
the mode amplitudes, which is related to the stochasticity 
parameter r, defined as the cross-correlation coefficient be- 
tween the two fields. While the two are related we emphasize 
that even small deviations of r from unity may lead to large 
relative fluctuations between the two fields. These are of
interest whenever one is trying to relate the fields to one another
to determine their relative amplitudes. One example is
the bias determination using the cross-correlation between
the weak lensing signal (tracing the matter density) and the
galaxies. Another example is the relative bias determination
between two different galaxy populations, which we propose
here as an alternative method to determine the galaxy bias,
because galaxies in low mass halos have a bias of $b \sim 0.7$
independent of their mass. In all cases we find the scatter
between the fields in individual modes is significant and one
cannot assume the fields are simply proportional to one another.
This scatter, coupled with a small number of modes
on large scales, makes it difficult to accurately determine
the bias (or relative bias) and needs to be included in the
predictions of how accurately one can determine the matter
power spectrum with these methods.

The second goal of this paper was to revisit the halo bias 
as a function of halo mass. This relation is a fundamental
ingredient of any halo model (see Cooray & Sheth 2002 for
a recent review) and plays an important role if one is trying
to model galaxy clustering by connecting it to the underlying
halos. The previous generation of simulations (Jing 1998;
Sheth & Tormen 1999) had a limited dynamical range and
the predictions were not tuned specifically for $\Lambda$CDM models.
As a result the existing expressions overestimate the bias
by as much as 20% in the range below the nonlinear mass, 
which is likely to be the mass range for halos that host most 
of the galaxies. We propose a new expression that fits the 
simulations better. We argue that this expression should be 
fairly accurate for other cosmological models of interest as 
well, as long as the mass is expressed in units of nonlin- 
ear mass. We give corrections for small deviations from this 
model. The overall accuracy on bias-halo mass relation is at 
the level of 0.03 or better (for b < 1), which should help 
with the bias determination from the current generation of 
observations. 